Add guide on choosing entity_id and entity_uri for HERD references#2206

Closed
bendichter wants to merge 3 commits into dev from docs-herd-entity-id-uri-best-practices

@bendichter

Collaborator

Motivation

Follow-up to the HERD tutorial discussion in #2200 and the June 23 CN+LBNL sync. Users adding external resource references with HERD.add_ref need clear guidance on what to put in the entity_id and entity_uri fields, which is currently undocumented and was a recurring point of confusion (e.g. which of the many NCBITaxon / taxonomy / NCBI_TAXON forms to use, and what to do for atlases without per-term URLs).

This adds a narrative reference page (docs/source/external_resources_entity_guide.rst, in the Resources toctree).

Guidance

  • entity_id should be a CURIE (prefix:identifier) whose prefix is registered with bioregistry.io. The Bioregistry maps each CURIE to a canonical, resolvable URL and disambiguates the many overlapping identifier schemes.

  • entity_uri should be the URL the CURIE resolves to, which you can look up via https://bioregistry.io/<entity_id>.

  • A table of commonly used registries with verified example entity_id / entity_uri pairs:

    | Prefix | Use for | Example entity_id | Example entity_uri |
    |---|---|---|---|
    | NCBITaxon | Species | NCBITaxon:10090 | http://purl.obolibrary.org/obo/NCBITaxon_10090 |
    | ROR | Organizations | ROR:013meh722 | https://ror.org/013meh722 |
    | ORCID | People | ORCID:0000-0002-1825-0097 | https://orcid.org/0000-0002-1825-0097 |
    | UBERON | Brain regions (cross-species) | UBERON:0001950 | http://purl.obolibrary.org/obo/UBERON_0001950 |
    | MBA | Allen Mouse Brain Atlas | MBA:385 | https://purl.brain-bican.org/ontology/mbao/MBA_385 |
    | HBA | Allen Human Brain Atlas | HBA:4005 | https://purl.brain-bican.org/ontology/hbao/HBA_4005 |
    | DANDI | Dandisets | DANDI:000015 | https://dandiarchive.org/dandiset/000015 |

    (Each resolved URL was verified against the Bioregistry API.)

  • Fallback for resources without per-term URLs (e.g. the macaque D99 atlas): put the resource's overall URL in entity_uri and the term's atlas-specific ID in entity_id, so every reference still dereferences to something authoritative.
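
A minimal sketch of the guidance above, using only the Python standard library. The helper names (`split_curie`, `bioregistry_lookup_url`, `entity_fields`) and the loose CURIE pattern are hypothetical and are not part of hdmf or the Bioregistry. The lookup URL simply follows the https://bioregistry.io/<entity_id> pattern quoted above, and the example pairs are copied from the table.

```python
import re

# Assumption: lookup pages follow the https://bioregistry.io/<entity_id> pattern.
BIOREGISTRY_BASE = "https://bioregistry.io/"

# Hypothetical, deliberately loose check: a prefix, a colon, a non-empty identifier.
_CURIE_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9._]*):(\S+)$")

# entity_id -> entity_uri pairs copied from the table in this description.
EXAMPLE_PAIRS = {
    "NCBITaxon:10090": "http://purl.obolibrary.org/obo/NCBITaxon_10090",
    "ROR:013meh722": "https://ror.org/013meh722",
    "ORCID:0000-0002-1825-0097": "https://orcid.org/0000-0002-1825-0097",
    "UBERON:0001950": "http://purl.obolibrary.org/obo/UBERON_0001950",
    "MBA:385": "https://purl.brain-bican.org/ontology/mbao/MBA_385",
    "HBA:4005": "https://purl.brain-bican.org/ontology/hbao/HBA_4005",
    "DANDI:000015": "https://dandiarchive.org/dandiset/000015",
}


def split_curie(entity_id):
    """Split 'prefix:identifier' into (prefix, identifier); raise ValueError otherwise."""
    match = _CURIE_PATTERN.match(entity_id)
    if match is None:
        raise ValueError(f"not a prefix:identifier CURIE: {entity_id!r}")
    return match.group(1), match.group(2)


def bioregistry_lookup_url(entity_id):
    """Build the Bioregistry page where the resolved entity_uri can be looked up."""
    split_curie(entity_id)  # reject strings that are not shaped like a CURIE
    return BIOREGISTRY_BASE + entity_id


def entity_fields(entity_id):
    """Return the entity_id / entity_uri pair for one of the table's examples."""
    return {"entity_id": entity_id, "entity_uri": EXAMPLE_PAIRS[entity_id]}


if __name__ == "__main__":
    print(split_curie("NCBITaxon:10090"))
    print(bioregistry_lookup_url("NCBITaxon:10090"))
    print(entity_fields("NCBITaxon:10090"))
```

The pair returned by `entity_fields` corresponds to the entity_id and entity_uri values that would be supplied to HERD.add_ref. In the fallback case the atlas-specific ID may not be a CURIE, so a check like `split_curie` would not apply there.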

Notes

  • All example entity_uri values are inline literals (not hyperlinks), so the -W linkcheck CI does not probe them; only the two stable homepages (bioregistry.io, the W3C CURIE spec) are linked. Cross-references to hdmf.common.resources.HERD / add_ref resolve via the existing hdmf intersphinx mapping.
  • I couldn't do a full local docs build to confirm rendering because this checkout's nwb-schema submodule is in a modified state (build fails at config with No specification for 'BaseImage'), unrelated to this change. RST structure (table columns, title underlines) was validated separately.

🤖 Generated with Claude Code

Add a documentation page explaining how to populate the entity_id and
entity_uri fields when adding HERD external resource references:

- entity_id should be a CURIE (prefix:identifier) whose prefix is
  registered with bioregistry.io, which maps it to a canonical resolvable
  URL and avoids ambiguity between overlapping identifier schemes.
- entity_uri should be the URL the CURIE resolves to (which can be
  looked up via https://bioregistry.io/<entity_id>).
- Includes a table of commonly used registries (NCBITaxon, ROR, ORCID,
  UBERON, MBA, HBA, DANDI) with example entity_id/entity_uri pairs.
- Documents the fallback for resources without per-term URLs (e.g. the D99
  macaque atlas): put the resource URL in entity_uri and the term's
  atlas-specific ID in entity_id.

Adds the page to the Resources toctree.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 23, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.29%. Comparing base (2e38a0c) to head (3186906).

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #2206   +/-   ##
=======================================
  Coverage   95.29%   95.29%           
=======================================
  Files          30       30           
  Lines        3038     3038           
  Branches      450      450           
=======================================
  Hits         2895     2895           
  Misses         87       87           
  Partials       56       56           
Flag Coverage Δ
integration 73.14% <ø> (ø)
unit 85.97% <ø> (ø)

@bendichter bendichter marked this pull request as ready for review June 23, 2026 20:23
@bendichter

Collaborator Author

Comment thread docs/source/external_resources_entity_guide.rst
Comment thread docs/source/external_resources_entity_guide.rst
Per review feedback, map each general concept to the NWB fields it commonly
annotates (e.g. species -> Subject.species, people -> NWBFile.experimenter,
brain regions -> ElectrodeGroup.location / ImagingPlane.location), so users
can connect a concept to where it appears in an NWB file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@oruebel

oruebel commented Jun 23, 2026

Contributor

Thanks @bendichter! This is very helpful! This seems to lay the foundation for a more general guide on external resources. I am wondering whether this may be more appropriate for the https://nwb-overview.readthedocs.io website, to provide an entry point from there into the topic of external resources? I think this would probably be its own top-level page, but I think it would also be useful to add a section to the conversion guide (https://nwb-overview.readthedocs.io/en/latest/conversion_tutorial/user_guide.html#). What do you think?

@bendichter

Collaborator Author

yes, I agree it could go there. That way it can be a reference for MATLAB and Python users.

@oruebel

oruebel commented Jun 23, 2026

Contributor

> yes, I agree it could go there. That way it can be a reference for MATLAB and Python users.

I think this is a great start as is, so I think we can just move it to nwb-overview and then keep adding to it as we go in additional issues/PRs.

The UBERON, MBA, and HBA rows all mapped to the same location fields. Replace
the repeated list with a footnote reference so the field list
(ElectrodeGroup.location, ImagingPlane.location, electrodes location column)
is written once. list-table cannot span cells, so a shared footnote is the
cleanest way to avoid the repetition.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bendichter

Collaborator Author

closed in favor of nwb-overview

@bendichter bendichter closed this Jun 24, 2026